Adapting noisy speech models - Extended uncertainty decoding
Abstract
Most conventional techniques for noise adaptation assume a clean initial speech model, which is adapted to a specific noise condition using adaptation data accumulated from that condition. In this paper, a different problem is considered: adapting a noisy speech model to a specific noise condition. For example, the initial noisy model may be a multi-condition model, which is used to provide more accurate transcripts for the adaptation data than a clean model could provide, thereby yielding a more accurate adaptation. We develop the formulation for this new problem by combining and extending maximum likelihood linear regression (MLLR), constrained MLLR (CMLLR) and uncertainty decoding techniques. We also present an implementation which has been tested on the Aurora 4 database, assuming an initial multi-condition model trained using data corrupted by white noise. Significant word error rate (WER) reductions are achieved in comparison with other approaches.
Similar resources
Uncertainty-based learning of acoustic models from noisy data
We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its usage at the training stage remains limited to static model adaptation. We introduce a new Expectation Maximisation (EM) based technique, which we call uncertainty...
Uncertainty training and decoding methods of deep neural networks based on stochastic representation of enhanced features
Speech enhancement is an important front-end technique for improving automatic speech recognition (ASR) in noisy environments. However, incorrect noise suppression during speech enhancement often introduces additional distortions into the speech signal, which degrades ASR performance. To compensate for these distortions, ASR needs to consider the uncertainty of enhanced features, which can be achieved by using t...
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, where the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the noise DFT coefficients are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Reduced complexity equalization of lombard effect for speech recognition in noisy adverse environments
In real-world adverse environments, speech signal corruption by background noise, microphone channel variations, and speech production adjustments introduced by speakers in an effort to communicate efficiently over noise (Lombard effect) severely impact automatic speech recognition (ASR) performance. Recently, a set of unsupervised techniques reducing ASR sensitivity to these sources of distort...
Replacing uncertainty decoding with subband re-estimation for large vocabulary speech recognition in noise
In this paper, we propose a novel approach for parameterized model compensation for large-vocabulary speech recognition in noisy environments. The new compensation algorithm, termed CMLLR-SUBREST, combines the model-based uncertainty decoding (UD) with subspace distribution clustering hidden Markov modeling (SDCHMM), so that the UD-type compensation can be realized by re-estimating the models b...